Alibaba recently released Qwen2-Audio, a new open-source audio model that handles speech recognition, translation, and audio analysis, with significant performance gains over its predecessor. Qwen2-Audio comes in a base version and an instruction-tuned version, and supports multiple languages, including Chinese, Cantonese, French, English, and Japanese, which makes it suitable for applications such as sentiment analysis and translation. Compared with Qwen-Audio, Qwen2-Audio has been optimized across both architecture and performance, and it makes greater use of natural language prompts during pre-training.